Abstract

The quantum analogues of classical variable-length codes are indeterminate- 
length quantum codes, in which codewords may exist in superpositions of 
different lengths. This paper explores some of their properties. The length 
observable for such codes is governed by a quantum version of the Kraft- 
McMillan inequality. Indeterminate-length quantum codes also provide an 
alternate approach to quantum data compression. 

Introduction 

The development of quantum information theory is a striking example of 
the fruitful hybridization of two well-established disciplines. Both quantum 
mechanics and information theory have a rich set of concepts and a powerful 
toolbox of mathematical techniques. Their combination is yielding powerful 
insights into the physical meaning of "information".

One approach to this exploration is to begin with an idea of "classical" 
information theory and investigate how this idea must be re-interpreted or 
modified to fit into the quantum information framework. Ideas of fidelity,
quantum data compression, quantum error correcting codes, and the
capacities of various quantum channels [10] can all be viewed in this light.

A basic idea in the classical theory of data compression is the idea of a
variable-length code. A variable-length code assigns to different messages 
codewords consisting of different numbers of symbols. If shorter codewords 
are used for more common messages and longer ones for less common mes- 
sages, the average codeword length can be made shorter than would be pos- 
sible using a fixed-length code. (Natural languages take advantage of this 
idea. Common words like "the" are often very short, while unusual words 
like "sesquipedalian" are longer.) 

However, the original development of quantum data compression followed
a different route, parallel to the classical development based on "typical
sequences". This left open the question of whether there was a quantum
analogue to classical variable-length coding. Because a quantum code must
allow superpositions of different codewords, including superpositions of
codewords of different lengths, the quantum version would best be termed an
indeterminate-length quantum code.

One of us [3] made a preliminary investigation of this idea several years
ago. Subsequently, Braunstein et al. presented a quantum analogue to
classical Huffman coding. Because a general understanding of
indeterminate-length quantum codes was not available then, Braunstein et al.
were led to construct their code in an unnecessarily inefficient way. (See
the discussion in Section 2.5, below.) More recently, Chuang and Modha have
developed a quantum version of arithmetic coding as a route to quantum data
compression [0]. Bostrom has also investigated indeterminate-length codes in
connection with lossless quantum coding.

Our aim in this paper is to outline a general theory of indeterminate- 
length quantum codes, including their application to quantum data com- 
pression. 

We will first sketch a framework for discussing such codes. Each code will 
have a "codeword length" observable A with integer eigenvalues; allowable 
codewords include not only length eigenstates but arbitrary superpositions 
of them. The key requirement is that such codes be "condensable" — that 
is, that the individual codewords can be assembled into a string by means 
of a unitary operation. This condition leads us to prove a quantum version 
of the Kraft-McMillan inequality. Among the condensable codes are those 
that satisfy a quantum "prefix-free" condition, and we show (by giving an 
explicit condensation algorithm) that all such codes are condensable. We also 
show how classical variable-length codes can be used to construct quantum
indeterminate-length codes with analogous properties.

We next turn to the use of indeterminate-length codes for quantum data
compression. We achieve quantum data compression by taking a condensed
string of N codewords (in general no shorter than N times the largest
eigenvalue of A) and truncating it after the first $N\ell$ qubits, thus
using only $\ell$ qubits per input codeword. We show that the average
$\langle l \rangle$ of the codeword length observable A is the necessary
and sufficient value of $\ell$ to achieve high fidelity for this process. It
turns out that $\langle l \rangle$ is related to the quantum entropy S of
the quantum information source, and from this relation we are able to arrive
at the noiseless quantum coding theorem.

1 Indeterminate-length codes 
1.1 Zero-extended forms 

In a quantum code, codewords are states of finite strings of qubits. Superpo- 
sitions of codewords are also valid codewords, and to maintain high fidelity 
we must preserve the coherence of these superpositions in our coding and 
decoding processes. 

We wish to create a code in which different codewords have different
lengths; that is, they involve different numbers of qubits. But how do we
make sense of this idea? We'll begin by considering zero-extended forms
(zef) of the codewords. For zef codewords, we imagine that the codewords are
sitting at the beginning of a qubit register of fixed length, with
$|0\rangle$'s following. These codewords span a subspace of the Hilbert
space of register states.

Our first essential requirement is that the codewords carry their own 
length information. That is, we require that there is a "length" observable 
A on the zef codeword subspace with the following two properties: 

• The eigenvalues of A are $1, \ldots, l_{\max}$, where $l_{\max}$ is the
  length of the register.

• If $|\psi_{\rm zef}\rangle$ is an eigenstate of A with eigenvalue $l$, then
  it has the form

  \[ |\psi_{\rm zef}\rangle = |\psi^{1 \cdots l} \, 0^{l+1 \cdots l_{\max}}\rangle . \tag{1} \]

  In other words, the last $l_{\max} - l$ qubits in the register are in the
  state $|0\rangle$ for a zef codeword of length $l$.

The length observable A was also considered in [7].

For each $l = 1, \ldots, l_{\max}$, we let $d_l$ be the dimension of the
subspace spanned by the A-eigenstates with eigenvalue $l$. Denote the
projection onto this subspace by $\pi_l$. Then $\mathrm{Tr}\, \pi_l = d_l$.



1.2 Condensable codes 

We want to be able to make use of the comparative shortness of some code- 
words by "packing" the codewords together, eliminating the trailing zeroes 
that "pad" the ends of the zef codewords. But this must be a process that 
maintains quantum coherences in superpositions of codeword states — that is, 
it must be described by a unitary transformation. Furthermore, we wish to 
be able to coherently pack together any number of codewords. 

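We say that a code is condensable if the following condition holds: For
any N, there is a unitary operator U (depending on N) that maps

\[ \underbrace{|\psi_{1,{\rm zef}}\rangle \otimes \cdots \otimes |\psi_{N,{\rm zef}}\rangle}_{N l_{\max}\ \text{qubits}} \;\longrightarrow\; \underbrace{|\Psi_{1 \cdots N,{\rm zef}}\rangle}_{N l_{\max}\ \text{qubits}} \tag{2} \]

with the property that, if the individual codewords are all length
eigenstates, then U maps the codewords to a zef string of the $N l_{\max}$
qubits; that is, one with $|0\rangle$'s after the first
$L = l_1 + \cdots + l_N$ qubits:

\[ |\psi_1^{1 \cdots l_1} \, 0^{l_1+1 \cdots l_{\max}}\rangle \otimes \cdots \otimes |\psi_N^{1 \cdots l_N} \, 0^{l_N+1 \cdots l_{\max}}\rangle \;\longrightarrow\; |\Psi^{1 \cdots L} \, 0^{L+1 \cdots N l_{\max}}\rangle . \tag{3} \]

This process is called condensation. Since every codeword is a
superposition of length eigenstates, it suffices to specify how the
condensation process functions for such codewords.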

Note that we have made no assumptions about the details of the
condensation process. In the most straightforward case, condensation would
be accomplished by concatenation of the codewords. The condensed state in
Equation 3 would be of the form

\[ |\psi_1^{1 \cdots l_1} \cdots \psi_N^{L-l_N+1 \cdots L} \, 0^{L+1 \cdots N l_{\max}}\rangle . \tag{4} \]

This special type of condensation is called simple condensation, and those
codes whose codewords can be condensed in this way are said to be simply
condensable codes. Obviously, all simply condensable codes are condensable;
but the converse is not true.

The condensability condition is phrased as an "encoding" requirement,
but the unitary character of the packing process automatically yields a
decoding condition: we can unpack a condensed string by applying the
$U^{-1}$ transformation.

It is interesting to compare the analogous classical situation. Classi- 
cal codewords in a variable-length code can always be concatenated into a 
"packed" string. Only for uniquely decipherable codes is this packing re- 
versible. In the quantum case, since arbitrary superpositions of codewords 
are also legal codewords, the concatenation process itself must be unitary. 
This automatically implies that it can be reversed. 



1.3 The quantum Kraft-McMillan inequality 



Given that the codewords carry their own length information and form a
condensable code, we next derive a condition on the codeword length
observable. Fix a value of N and consider all codeword strings that have
given values of $l_1, l_2, \ldots, l_N$. These states lie in a subspace of
dimension $d_{l_1} d_{l_2} \cdots d_{l_N}$, and all of them are mapped by U
into something of the form
$|\Psi^{1 \cdots L} \, 0^{L+1 \cdots N l_{\max}}\rangle$.

Next, imagine strings of codewords with different lengths
$l'_1, l'_2, \ldots, l'_N$, but whose lengths sum to the same total length:
$L' = L$. The space spanned by these has dimension
$d_{l'_1} d_{l'_2} \cdots d_{l'_N}$ and is orthogonal to the previous space.
We can consider all such combinations of lengths that sum to the same L.
Each of these states maps under U to something of the form
$|\Psi^{1 \cdots L} \, 0^{L+1 \cdots N l_{\max}}\rangle$, so we obtain

\[ \left( \begin{array}{c} \text{dimension of space} \\ \text{containing all codeword} \\ \text{strings with the same } L \end{array} \right) \le \left( \begin{array}{c} \text{dimension of space} \\ \text{containing all strings} \\ |\Psi^{1 \cdots L} \, 0^{L+1 \cdots N l_{\max}}\rangle \end{array} \right) \]

\[ \sum_{l_1 + \cdots + l_N = L} d_{l_1} \cdots d_{l_N} \le 2^{L} . \]

It follows that

\[ 2^{-L} \sum_{l_1 + \cdots + l_N = L} d_{l_1} \cdots d_{l_N} = \sum_{l_1 + \cdots + l_N = L} \left( 2^{-l_1} d_{l_1} \right) \cdots \left( 2^{-l_N} d_{l_N} \right) \le 1 . \]

There are at most $N l_{\max}$ possible values of L. If we sum both sides of
this inequality over those values, the resulting sum on the left-hand side
will include all possible values of $l_1, \ldots, l_N$. Therefore,

\[ \sum_{l_1, \ldots, l_N} \left( 2^{-l_1} d_{l_1} \right) \cdots \left( 2^{-l_N} d_{l_N} \right) = \left( \sum_{l} 2^{-l} d_l \right)^{N} \le N l_{\max} . \]

This is of the form $K^N \le N l_{\max}$. If $K > 1$, then this inequality
must be violated for sufficiently large N. Thus, we conclude that $K \le 1$.
But

\[ K = \sum_{l} 2^{-l} d_l = \sum_{l} 2^{-l} \, \mathrm{Tr}\, \pi_l = \mathrm{Tr} \left( \sum_{l} 2^{-l} \pi_l \right) = \mathrm{Tr}\, 2^{-A} . \]

This gives us our quantum version of the Kraft-McMillan inequality. For any
indeterminate-length quantum code that is condensable, the length
observable A on zef codewords must satisfy

\[ \mathrm{Tr}\, 2^{-A} \le 1 \tag{5} \]

(where the trace is taken over the subspace of zef codewords).
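
The counting argument above can be checked numerically on a small case. The
following Python sketch is only an illustration under stated assumptions: the
multiplicities d = {1: 1, 2: 1, 3: 2} are an assumed example (they match a
code with lengths 1, 2, 3, 3), and the helper names kraft_sum and
dims_by_total_length are ours, not part of the theory.

```python
from fractions import Fraction
from itertools import product

def kraft_sum(d):
    """K = sum over l of 2^{-l} d_l, for d given as {length: multiplicity}."""
    return sum(Fraction(mult, 2 ** l) for l, mult in d.items())

def dims_by_total_length(d, n):
    """For each total length L, sum d_{l_1}...d_{l_n} over l_1+...+l_n = L."""
    totals = {}
    for lengths in product(d, repeat=n):
        dim = 1
        for l in lengths:
            dim *= d[l]
        total = sum(lengths)
        totals[total] = totals.get(total, 0) + dim
    return totals

# Assumed example: multiplicities of a code with lengths 1, 2, 3, 3.
d = {1: 1, 2: 1, 3: 2}
n = 4
totals = dims_by_total_length(d, n)
K = kraft_sum(d)

# Each block of strings with total length L must fit into 2^L dimensions.
fits = all(dim <= 2 ** L for L, dim in totals.items())
# Summing 2^{-L} * (dimension) over all L reproduces K^n.
summed = sum(Fraction(dim, 2 ** L) for L, dim in totals.items())
print(K, fits, summed == K ** n)
```

Exact rational arithmetic is used so that the comparison with K^n is not
affected by rounding.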

1.4 Prefix-free codes 

An alternate condition that we might impose on our indeterminate-length 
quantum code is that the code be prefix-free — informally, that no initial seg- 
ment of a zef codeword is itself a codeword. In the next section, we will show 
that all prefix-free codes are simply condensable. In this section, we will 
discuss the meaning of the prefix-free condition and show that any condens- 
able code can be transformed into a prefix-free code with the same length 
characteristics. 

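Suppose $|\psi_1\rangle$ and $|\psi_2\rangle$ are length eigenstate zef
codewords with lengths $l_1$ and $l_2$, respectively; and further suppose
that $l_2 > l_1$. These states have the form

\[ |\psi_1\rangle = |\psi_1^{1 \cdots l_1} \, 0^{l_1+1 \cdots l_{\max}}\rangle , \qquad |\psi_2\rangle = |\psi_2^{1 \cdots l_2} \, 0^{l_2+1 \cdots l_{\max}}\rangle . \tag{6} \]

For the codeword $|\psi_1\rangle$, the quantum state of the first $l_1$
qubits of the register is just the pure state
$|\psi_1^{1 \cdots l_1}\rangle$. For the codeword $|\psi_2\rangle$, the first
$l_1$ qubits may be in a mixed state, described by the density operator

\[ \rho_2^{1 \cdots l_1} = \mathrm{Tr}_{l_1+1 \cdots l_{\max}} \, |\psi_2\rangle\langle\psi_2| = \mathrm{Tr}_{l_1+1 \cdots l_2} \, |\psi_2^{1 \cdots l_2}\rangle\langle\psi_2^{1 \cdots l_2}| . \tag{7} \]

We say that our code is prefix-free if, for all such pairs of codewords,

\[ \langle\psi_1^{1 \cdots l_1}| \, \rho_2^{1 \cdots l_1} \, |\psi_1^{1 \cdots l_1}\rangle = 0 . \tag{8} \]

In other words, the first $l_1$ qubits of a codeword of length $l_1$ have a
state that is orthogonal to the (possibly mixed) state of the first $l_1$
qubits of a codeword of length $l_2 > l_1$.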
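
As an illustration of the prefix-free condition, the following NumPy sketch
evaluates the overlap in Equation 8 for a few three-qubit zef codewords. It
is a sketch under stated assumptions: the particular states are assumed
examples, and the helpers ket and reduced_first are hypothetical names
introduced here.

```python
import numpy as np

def ket(bits):
    """Computational-basis state of len(bits) qubits (first qubit leftmost)."""
    v = np.zeros(2 ** len(bits))
    v[int(bits, 2)] = 1.0
    return v

def reduced_first(psi, n_qubits, keep):
    """Density operator of the first `keep` qubits of the pure state psi."""
    m = psi.reshape(2 ** keep, 2 ** (n_qubits - keep))
    return m @ m.conj().T

l1 = 1
psi1 = ket("000")                              # zef codeword of length 1
psi2 = (ket("110") + ket("111")) / np.sqrt(2)  # zef codeword of length 3

prefix1 = reduced_first(psi1, 3, l1)  # pure state of the first qubit
rho2 = reduced_first(psi2, 3, l1)     # state of the first qubit of psi2
overlap = float(np.trace(prefix1 @ rho2).real)

# Contrast: a length-2 codeword |01> zero-extended to |010> shares the
# prefix |0> with psi1, so the condition fails for this pair.
rho_bad = reduced_first(ket("010"), 3, l1)
overlap_bad = float(np.trace(prefix1 @ rho_bad).real)
print(overlap, overlap_bad)
```

Because the prefix state of the shorter codeword is pure, the expectation
value in Equation 8 can be computed as a trace of the product of the two
reduced operators.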

Another way of expressing this condition is to say that a length eigenstate
zef codeword of length $l$ can be distinguished from a codeword of greater
length by making a measurement only on the first $l$ qubits. Shorter
codewords can be "recognized" from shorter segments. This means that the
operator $\pi_l$, the projection onto the subspace of length eigenstate zef
codewords of length $l$, has the form

\[ \pi_l = \pi^{1 \cdots l} \otimes 1^{l+1 \cdots l_{\max}} . \tag{9} \]

Of course, actually measuring the codeword length A would be disastrous,
because such a real measurement would destroy the coherence of
superpositions of length eigenstates without possibility of restoration. The
condensation process must therefore not include any measurement of length
information. On the other hand, the process may include interactions by
which, at some intermediate stage, a quantum computer has become entangled
with codeword length information, provided that, by the end of the
computation, this entanglement has been eliminated. In Section 1.5 we
discuss this in more detail.

A particularly simple way of generating a prefix-free quantum code is to 
use a classical prefix-free code as a basis for the zef codeword subspace. For 
example, the classical codewords 0, 10, 110 and 111 form a prefix-free set. 
The corresponding quantum code can be specified by giving an orthogonal 
basis of length eigenstate zef codewords, as follows: 

    state           length
    $|000\rangle$   1
    $|100\rangle$   2
    $|110\rangle$   3
    $|111\rangle$   3

The length observable A for this code is

\[ A = |000\rangle\langle 000| + 2\,|100\rangle\langle 100| + 3\,|110\rangle\langle 110| + 3\,|111\rangle\langle 111| . \tag{10} \]

Of course, any superposition of these is also a zef codeword, though not
necessarily a codeword of definite A. It is easy to verify that the code
defined in this way satisfies the criterion for a quantum "prefix-free"
code. The procedure illustrated here may be extended in the obvious way to
create a quantum prefix-free code from any classical prefix-free code.
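
The example can be made concrete with a short NumPy sketch that builds the
length observable from the classical codewords 0, 10, 110 and 111 and
evaluates the trace in the quantum Kraft-McMillan inequality. The variable
names, and the particular superposition used at the end, are assumptions
made for illustration.

```python
import numpy as np

codewords = ["0", "10", "110", "111"]   # classical prefix-free set from the text
l_max = max(len(c) for c in codewords)
dim = 2 ** l_max

A = np.zeros((dim, dim))   # length observable, written on the full register
P = np.zeros((dim, dim))   # projector onto the zef codeword subspace
for c in codewords:
    idx = int(c.ljust(l_max, "0"), 2)   # zero-extend the codeword
    A[idx, idx] = len(c)
    P[idx, idx] = 1.0

# Tr 2^{-A}, with the trace restricted to the zef codeword subspace.
lengths = np.diag(A)
support = np.diag(P) == 1.0
kraft = float(np.sum(2.0 ** (-lengths[support])))

# An assumed superposition of a length-1 and a length-3 codeword.
psi = np.zeros(dim)
psi[int("000", 2)] = 1 / np.sqrt(2)
psi[int("111", 2)] = 1 / np.sqrt(2)
mean_length = float(psi @ A @ psi)
print(kraft, mean_length)
```

The superposition is a legitimate zef codeword, but it is not an eigenstate
of A; only its average length is well defined.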

Suppose we have an indeterminate-length quantum code that satisfies the
quantum Kraft-McMillan inequality. Then the space of zef codewords of this
code is spanned by a basis of eigenstates of this code's length observable A.
Let $|\psi_{{\rm zef};l,i}\rangle$ be the $i$th such basis vector with
length eigenvalue $l$, and let $n_l$ be the number of basis vectors that have
length $l$. (Thus for a given $l$, $i$ ranges from 1 to $n_l$.) The quantum
Kraft-McMillan inequality (Eq. 5) implies that

\[ \sum_{l} n_l \, 2^{-l} \le 1 . \tag{11} \]

Given values of $l$ and $n_l$ that satisfy Eq. 11, we can construct a
classical prefix-free code with $n_l$ distinct codewords of length $l$ bits.
(In this case, Eq. 11 is just the classical Kraft inequality.) Denote by
$c_{l,i}$ the $i$th codeword of length $l$ bits in this prefix-free code. We
use this classical prefix-free code to create a quantum prefix-free code by
constructing a basis of length eigenstates, whose elements are
$|c_{l,i} \, 0^{l+1 \cdots l_{\max}}\rangle$. Now consider the mapping

\[ |\psi_{{\rm zef};l,i}\rangle \;\longrightarrow\; |c_{l,i} \, 0^{l+1 \cdots l_{\max}}\rangle . \tag{12} \]

This is a mapping from orthogonal basis vectors to orthogonal basis vectors
that can be extended linearly to a unitary mapping V on the entire Hilbert
space. V takes the original codewords to prefix-free codewords in a
length-preserving way; that is, the length observable $A'$ of the
prefix-free code is given by $A' = V A V^{\dagger}$. In short, any quantum
code that satisfies Equation 5 can be unitarily mapped to a prefix-free
quantum code with identical length characteristics.
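
The classical step in this construction, building a prefix-free code with
prescribed numbers of codewords of each length, can be sketched in Python as
follows. The text does not specify which construction is intended; the
sketch assumes the standard "canonical" assignment (codewords taken in
increasing binary order, shortest first), and the function names are
hypothetical.

```python
from fractions import Fraction

def prefix_free_code(counts):
    """Build a prefix-free code from counts = {length l: n_l}.

    Uses the canonical assignment: codewords are consecutive binary
    numbers, shifted left whenever the length increases.
    """
    if sum(Fraction(n, 2 ** l) for l, n in counts.items()) > 1:
        raise ValueError("Kraft inequality violated")
    code, value, prev = {}, 0, 0
    for l in sorted(counts):
        value <<= (l - prev)          # extend the running value to l bits
        words = []
        for _ in range(counts[l]):
            words.append(format(value, "0{}b".format(l)))
            value += 1
        code[l] = words
        prev = l
    return code

def zero_extend(word, l_max):
    """Label of the zef basis state |c 0...0> for a classical codeword c."""
    return word.ljust(l_max, "0")

code = prefix_free_code({1: 1, 2: 1, 3: 2})
zef_labels = [zero_extend(w, 3) for l in sorted(code) for w in code[l]]
print(code, zef_labels)
```

For the counts n_1 = 1, n_2 = 1, n_3 = 2 this reproduces the codewords 0, 10,
110 and 111 used in the example above; the labels in zef_labels name the
target basis states on the right-hand side of Equation 12.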

Are all prefix-free quantum codes condensable? As we shall see in
Section 1.6, they are; but in order to show this, we will have to give an
explicit algorithm for a quantum computer to condense the codewords of a
prefix-free quantum code. This algorithm must maintain the coherence of
superpositions of codewords of different lengths. Before we describe our
algorithm, we will first discuss some key characteristics of coherent
information processing.



1.5 Coherence and reversibility

We adopt a high-level model of a quantum computer, which could in principle 
be implemented by a quantum Turing machine or an array of quantum gates. 
Our quantum computer contains several registers of qubits, which initially 
hold zef codewords from a prefix-free quantum code. The computer also 
includes a central processing unit that contains various counters and pointers, 
each of which can take on integer values (or superpositions of these). A 
system clock keeps track of the number of machine cycles that have passed 
since the beginning of the computation. (This clock may be treated as an 
entirely classical system; its function is simply to control the execution of 
our quantum program.) Finally, the computer contains an output "tape" of 
qubits (initially all in the state |0)) on which the condensed string is to be 
written. 

Our job is to write the input code words onto the output tape in a way 
that preserves the coherence of superpositions of different codewords, includ- 
ing superpositions of codewords of different lengths. This means that the 
operation of the computer must be unitary. We can guarantee this unitarity 
if we satisfy certain conditions: 

1. Reversibility. In a classical code, all codewords have a determinate 
length. We can choose an orthogonal basis of length eigenstates to 
be "quasi-classical" input states of our computer. (These states need 
not be fully classical — for example, the qubits in these codeword states 
may be entangled with each other. However, each codeword in our basis 
has a determinate length.) We require that distinct "quasi-classical" 
inputs lead to distinct final states of the computer. This is essentially a
requirement that the computation be reversible on these quasi-classical
inputs [11, 12].



2. Coherent computation. The computation includes no measurement or 
process in which the environment becomes entangled with the com- 
puter. As a special case of this, we require that the computation end 
after exactly the same number of steps for any input codeword. If the 
computation took more steps for longer codewords, the halting time of 
the computation would constitute a measurement of codeword length, 
and would destroy the coherence. 

3. Localization of coherence in the output. For any quasi-classical input,
at the end of the computation all input registers and internal variables
in the central processor have been reset to fixed values that are
independent of the input. Only the output tape retains any information
about the input. This will guarantee that a superposition of
quasi-classical inputs will not lead to entanglement between the output tape
and the rest of the computer; the coherence will be localized in the
output tape.

A similar set of conditions has been outlined elsewhere, where it is used to
specify quantum algorithms for data compression and for quantum arithmetic
coding.

The reversibility requirement ensures that an orthogonal basis of initial 
states maps to an orthogonal basis of final states. If the computation is coher- 
ent, this map extends by linearity to a unitary evolution for the computer's 
quantum state. The final requirement guarantees that the quantum infor- 
mation initially in the input registers can be recovered from the condensed 
output tape alone. We will discuss each of these requirements in turn. 

Consider how our quantum computer acts on quasi-classical (length eigen- 
state) inputs. If we were to map out its algorithm as a flowchart, the require- 
ment of reversibility would impose two sorts of requirements. First, each 
individual operation on the data must be reversible. Second, the branches 
and joins in the flowchart must be specified in a reversible way. 

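A branch can be pictured in this way:

[Flowchart: a "branch condition" node entered from the top, with a "true"
exit leading downward and a "false" exit leading to the right.]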

Execution of the program enters from the top, and a logical "branch condi- 
tion" is evaluated. If the branch condition is true, execution proceeds along 
the downward branch; if false, along the rightward branch. This is plainly 
reversible, as long as the evaluation of the branch condition is done in a re- 
versible way; there is no ambiguity in the execution of the reversed program. 
However, a simple join

[Flowchart: two paths merging at a "join" node with no governing condition.]

is not reversible, since in the reversed program it is not clear which of the
two paths to take. The point is that a join in the flowchart is a reversed
branch, and thus must be governed by a logical "join condition":

[Flowchart: a "join condition" node entered from above along the "true"
path and from the right along the "false" path, with a single exit.]



The program is designed so that the "join condition" is true whenever the 
execution approaches from above, and false whenever execution approaches 
from the right. 

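In our program, we will want to use branches and joins to create "loops",
like so:

[Flowchart: a loop entered through a join governed by a "start condition",
followed by a block of operations and then a branch governed by a "stop
condition". When the stop condition is true, execution leaves the loop;
when it is false, execution returns to the join.]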

The "start condition" is a logical condition that is only true at the beginning 
of the execution of the loop and not thereafter; the "stop condition" is only 
true at the end of the execution of the loop and not before. 

We can also conveniently represent the reversible loop structure in pseu- 
docode form: 

    loop enter (start condition)
        operations
    loop exit (stop condition)

Both the beginning and the end of the loop are governed by logical conditions. 
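
A purely classical Python sketch may help to fix the idea of a loop governed
by both a start and a stop condition. It is an illustration only: the loop
body (incrementing a counter i and adding it to x) is an assumed example of a
reversible operation, and the functions run_forward and run_backward are
hypothetical names. The assertions play the role of the join conditions.

```python
def run_forward(x, n):
    """loop enter (i = 0); i <- i + 1; x <- x + i; loop exit (i = n)."""
    i = 0
    first = True
    while True:
        # Join: the start condition holds exactly when entering from above.
        assert (i == 0) == first
        first = False
        i += 1
        x += i
        if i == n:          # branch governed by the stop condition
            return x, i

def run_backward(x, i, n):
    """The same loop run in reverse: conditions swap roles, steps invert."""
    first = True
    while True:
        # The old stop condition now acts as the join condition.
        assert (i == n) == first
        first = False
        x -= i
        i -= 1
        if i == 0:          # the old start condition now ends the loop
            return x, i

x_out, i_out = run_forward(5, 3)
x_back, i_back = run_backward(x_out, i_out, 3)
print(x_out, i_out, x_back, i_back)
```

Running the reversed loop restores the initial values, which is possible
only because both ends of the loop are governed by conditions (the sketch
assumes n is at least 1).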

The requirement that the computation be coherent may at first seem 
difficult to achieve, since each branch point (or join point) in the algorithm 
involves the evaluation of a condition — apparently a measurement process. 
However, these conditions can control the execution of the program without 
any irreversible loss of coherence. 

Let us suppose that the quantum system Q is some portion of our computer,
and that we wish to branch our program based on a condition about the state
of Q. The condition is represented by a projection $\Pi$ acting on the
Hilbert space $\mathcal{H}$ describing Q. Any initial state $|\psi\rangle$
of Q can be written as

\[ |\psi\rangle = \Pi |\psi\rangle + \Pi^{\perp} |\psi\rangle . \tag{13} \]

We could imagine evaluating the condition by making a measurement of the
observable represented by $\Pi$ and $\Pi^{\perp}$. But this would destroy
the coherence in this superposition, so a less destructive operation is
required.

We join to Q a single qubit (in another part of the quantum computer),
and consider the operator U on the joint system:

\[ U = \left( |0\rangle\langle 1| + |1\rangle\langle 0| \right) \otimes \Pi + 1 \otimes \Pi^{\perp} . \tag{14} \]

U is easily verified to be a unitary operator, and thus it could represent
some coherent quantum evolution of the joint system. If the qubit is
initially set to the state $|0\rangle$ and then U acts, we obtain

\[ U \, |0\rangle \otimes |\psi\rangle = |1\rangle \otimes \Pi |\psi\rangle + |0\rangle \otimes \Pi^{\perp} |\psi\rangle . \tag{15} \]

This is an entangled state of the qubit and Q. If we were to make a
measurement of the qubit in the standard basis, we would be effectively
measuring the observable $\Pi$ on Q. That is, the qubit "contains" the value
of the observable $\Pi$. However, the interaction is completely reversible,
and in this case may be undone by a further application of U itself.
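
The properties claimed for U are easy to confirm numerically. The following
NumPy sketch assumes, for illustration only, a four-dimensional system Q and
a projection onto its first two basis states; it checks that U is unitary,
that it acts as in Equation 15, and that a second application of U undoes
the first.

```python
import numpy as np

rng = np.random.default_rng(0)
dim_q = 4

# Assumed example condition: projector onto the first two basis states of Q.
Pi = np.diag([1.0, 1.0, 0.0, 0.0])
Pi_perp = np.eye(dim_q) - Pi

X = np.array([[0.0, 1.0], [1.0, 0.0]])           # |0><1| + |1><0|
U = np.kron(X, Pi) + np.kron(np.eye(2), Pi_perp)  # Equation 14

# A random normalized state of Q.
psi = rng.normal(size=dim_q) + 1j * rng.normal(size=dim_q)
psi /= np.linalg.norm(psi)
zero = np.array([1.0, 0.0])
one = np.array([0.0, 1.0])

out = U @ np.kron(zero, psi)
expected = np.kron(one, Pi @ psi) + np.kron(zero, Pi_perp @ psi)  # Equation 15

is_unitary = np.allclose(U.conj().T @ U, np.eye(2 * dim_q))
matches = np.allclose(out, expected)
undone = np.allclose(U @ out, np.kron(zero, psi))
print(is_unitary, matches, undone)
```

The tensor-product ordering in the sketch places the switch qubit first and
Q second, as in Equation 14.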

The qubit can be used as a switch to instruct the computer which branch
of the computation to follow. Suppose we wish to specify that, if the qubit
is $|0\rangle$, the rest of the computer performs a computation described by
the unitary operator $V_0$, whereas if the qubit is $|1\rangle$ then we wish
to do the computation $V_1$. Then we instruct the entire computer
(including the switch qubit) to perform a coherent computation described by
the unitary operator

\[ V = |0\rangle\langle 0| \otimes V_0 + |1\rangle\langle 1| \otimes V_1 . \tag{16} \]

If the overall state of the computer is a superposition of the two switch
states, both computations are carried out, each in its own branch of the
superposition. The computer may become increasingly entangled, but the
coherence of its overall state is preserved.
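
A minimal NumPy sketch of the switched computation in Equation 16 follows.
The choices of V_0 (the identity) and V_1 (a bit flip) on a one-qubit "rest
of the computer" are assumptions made only to keep the example small; the
function name controlled is hypothetical.

```python
import numpy as np

def controlled(v0, v1):
    """V = |0><0| (x) V0 + |1><1| (x) V1 for two unitaries of equal size."""
    p0 = np.diag([1.0, 0.0])
    p1 = np.diag([0.0, 1.0])
    return np.kron(p0, v0) + np.kron(p1, v1)

v0 = np.eye(2)                              # assumed: do nothing
v1 = np.array([[0.0, 1.0], [1.0, 0.0]])     # assumed: flip one qubit
V = controlled(v0, v1)

switch = np.array([1.0, 1.0]) / np.sqrt(2)  # superposition of switch states
rest = np.array([1.0, 0.0])                 # rest of the computer in |0>
out = V @ np.kron(switch, rest)
print(out)
```

With the switch in a superposition, the output is the entangled state
(|00> + |11>)/sqrt(2): each switch state carries its own branch of the
computation, and V itself remains unitary.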

We have shown that any branching condition that can be represented by
a projection operator $\Pi$ can be used to control the execution of the
program without any necessary loss of coherence. The cost is entanglement
among the parts of the computer.

A join point in the algorithm is simply a time-reversed branch point.
Just before the join, the computer is in a state like Equation 15, in which
the qubit is entangled with the system Q. The operator $U^{-1} = U$ acts,
and we return the qubit to the state $|0\rangle$ and the system Q to a state
like Equation 13. We have "disentangled" Q from the qubit, so the two
branches of the computation (controlled by the qubit) have merged.

Our second concern with coherent computation is the synchronization of 
the computation on different components of the initial superposition. This 
can be maintained without much difficulty by introducing appropriate "delay 
loops" into the program, so that its execution requires exactly the same 
number of machine cycles for any input. 

We will address our final concern, that the output tape should wind up 
unentangled with the rest of the computer, by showing that the final state of 
the rest of the computer (input registers and central processor) is independent 
of the input state. 



1.6 Prefix-free codes are simply condensable 

We are at last ready to give our algorithm for simply condensing the code- 
words of a prefix-free quantum code. First, we establish our notation and 
describe the contents of our computer in slightly more detail: 

Registers Our computer contains N registers, each consisting of $l_{\max}$
qubits. The $i$th register is denoted $R_i$ and the $k$th qubit of this
register is called $R_{i,k}$. Initially, each register contains a zef
codeword from a fixed prefix-free quantum code.

Tape There is a "tape" T containing at least $N l_{\max}$ qubits, all of
which are initially in the state $|0\rangle$. The $n$th qubit in the tape is
called $T_n$.

Counter There is a counter variable $c$, which can take on integer values
starting with 0 (or, of course, superpositions of these). The initial
state of $c$ is $|0\rangle$.

Pointers There are several pointer variables, which like the counter
variable take on integer values and have an initial state $|0\rangle$. These
variables point to locations in the computer's memory, but of course they
are themselves quantum variables and can take on entangled superpositions
of values. There is an overall register pointer $r$ and, for each register,
a qubit pointer $q_i$ (for the $i$th register). The tape also has a pointer
variable $p$.

The first section of the program copies the contents of the registers to the
tape, moving the pointers in the process.

    loop enter (r = 0)
        r ← r + 1
        loop enter (q_r = 0)
            q_r ← q_r + 1
            p ← p + 1
            T_p ← T_p ⊕ R_{r,q_r}
        loop exit (R_{r,1} ··· R_{r,q_r} is a codeword of length q_r)
    loop exit (r = N)

(The notation a ← a ⊕ b indicates the "controlled not" operation on the
qubits, with a as the "target" and b as the "control" qubit.) Notice that the
exit condition for the inner loop (which copies the register qubits one by one
onto the tape) is legitimate because the code is prefix-free. This means that
the question of whether the first q_r qubits of the register form a codeword
of length q_r can be settled by measuring a projection-type observable on
those qubits. (The computer does not make such a measurement, but instead
coherently controls its operation as described above.)

We also note that, since the procedure is just to copy the register contents 
to the output tape, we are doing simple condensation. 

At this stage, the various pointer variables are entangled with the
codeword length information; furthermore, the time at which the computer
reaches this stage of the computation is indeterminate. We now resynchronize
the program via a delay loop that causes the computer to "idle" until a fixed
time D (chosen large enough so that the first section of the program has
finished even for the longest possible input codewords).

    loop enter (c = 0)
        c ← c + 1
    loop exit (time = D)

The second half of the program is the reverse of the first half, except that
the register is uncopied, rather than the tape.

    loop enter (time = D + 1)
        c ← c − 1
    loop exit (c = 0)
    loop enter (r = N)
        loop enter (R_{r,1} ··· R_{r,q_r} is a codeword of length q_r)
            R_{r,q_r} ← T_p ⊕ R_{r,q_r}
            p ← p − 1
            q_r ← q_r − 1
        loop exit (q_r = 0)
        r ← r − 1
    loop exit (r = 0)

The program now ends, after exactly 2D machine steps. All pointers and 
counters have been returned to their initial zero values, and the input qubit 
registers have been reset to $|00 \cdots 0\rangle$. Only the qubit tape now contains any
non-zero data, in the form of a simply condensed string of N codewords. In 
short, the computer at the end retains no codeword-length information at all. 
Superpositions of codewords of different length will thus remain coherent in 
the condensation process. Since the algorithm works for any given N, the 
prefix-free quantum code is simply condensable. 
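
The action of simple condensation on length-eigenstate basis inputs can be
imitated classically. The Python sketch below is only an illustration under
stated assumptions: it uses the example code 0, 10, 110, 111, represents
each basis state by its bit string, and ignores everything that makes the
quantum algorithm coherent (pointer resetting, uncopying of the registers,
and synchronization). The function names condense and decondense are
hypothetical.

```python
CODE = ["0", "10", "110", "111"]   # assumed example prefix-free code
L_MAX = 3

def condense(registers):
    """Pack zef codewords (bit strings of length L_MAX) onto a padded tape."""
    tape = []
    for reg in registers:
        q = 0
        while True:
            q += 1
            tape.append(reg[q - 1])    # copy one register bit to the tape
            if reg[:q] in CODE:        # decidable from the first q bits alone
                break
    return "".join(tape).ljust(len(registers) * L_MAX, "0")

def decondense(tape, n):
    """Recover the n zef codewords from a condensed tape."""
    registers, p = [], 0
    for _ in range(n):
        q = 0
        while tape[p:p + q] not in CODE:
            q += 1
        registers.append(tape[p:p + q].ljust(L_MAX, "0"))
        p += q
    return registers

inputs = ["000", "110", "100"]
tape = condense(inputs)
restored = decondense(tape, len(inputs))
print(tape, restored)
```

Because the map on basis strings is one-to-one, it is consistent with a
unitary extension to superpositions; the quantum program described above is
what actually realizes that extension without leaving length information
behind.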

We previously proved that every condensable code satisfies the quantum 
Kraft-McMillan inequality, and then that every quantum code that satisfies 
the Kraft-McMillan inequality can be unitarily remapped to a prefix-free 
code. We now learn that prefix-free quantum codes are simply condensable. 
Since unitary remapping might be part of a general condensation process, we 
have established that a quantum code is condensable if and only if it satisfies 
the quantum Kraft-McMillan inequality.

2 Quantum data compression



2.1 How many qubits? 

Classical variable-length codes are used for data compression — that is, the 
representation of classical information in a compact way, using as few re- 
sources (bits) as possible. This is done by encoding more probable mes- 
sages in shorter codewords, so that the average codeword length is mini- 
mized. In this section we will discuss how — and in what sense — quantum 
indeterminate-length codes may be used for quantum data compression. 

Suppose Alice is sending classical information to Bob using the following
classical variable-length code:

    message   codeword
    C_1       0
    C_2       10
    C_3       110
    C_4       111

If the message $C_1$ is sent, Bob receives a signal consisting of a single
bit (0); but if $C_4$ is sent, he receives three bits (111). In each case,
Bob knows how many bits are being used to send the message. If a long string
of messages is being sent, Bob at any stage knows how many complete messages
have been received.

Bob learns the length of each codeword because he actually learns which
codeword was sent. The fact that Bob learns the identity of each codeword
is not a problem in the classical situation; indeed, it is the whole point of
classical communication! This contrasts with quantum information transfer.
If Alice's signals, for example, are drawn from a non-orthogonal set of states,
Bob will not be able to determine reliably which signal was sent, and any
attempt to do so would damage the fidelity of the quantum information.

Suppose that Alice wishes to send quantum information to Bob using the
quantum analogue of the prefix-free code shown above. In other words, the
length eigenstate zef codewords are

    state           length
    $|000\rangle$   1
    $|100\rangle$   2
    $|110\rangle$   3
    $|111\rangle$   3

Arbitrary superpositions of these codewords are also allowed codewords. To
maintain the coherence of these superpositions, therefore, Bob must not
obtain any information about the length of the codeword he receives.

A quantum system actually used for the transmission of information must
have at least two degrees of freedom. The first is the "data" degree of
freedom, which may for instance be a qubit. The second degree of freedom is the
"location" degree of freedom. This is the physical degree of freedom which
determines whether or not Bob has access to the data degree of freedom.
The faithful transmission of a qubit in a state $|\psi\rangle$ from Alice's location $a$ to
Bob's location $b$ would be a process like this:

$$|\psi, a\rangle \longrightarrow |\psi, b\rangle . \tag{17}$$

Although we are phrasing our discussion in terms of the transmission of
quantum information from one spatial location to another, this analysis would also
apply to the storage and retrieval of information in a quantum computer.
There the "location" degree of freedom might be the reading of a clock; the
information stored at time $a$ is to be retrieved at some later time $b$.

If we have several data qubits, each one will have a location degree of
freedom (which may, of course, be correlated with the others). The number
of qubits transmitted from Alice to Bob will be the number of location degrees
of freedom that have evolved from $a$ to $b$. For instance, suppose that three
data qubits are in a joint state $|\psi_{123}\rangle$, and that Alice sends the first and
third qubits to Bob. The final state would be $|\psi_{123}, bab\rangle$, in which Bob has
received two qubits.

How could Alice send an indeterminate number of qubits to Bob? In
particular, if Alice is representing her quantum information using the
prefix-free quantum code above, how can she arrange to send only the first $l$ qubits
of a zef codeword of length $l$? The transmission of the length eigenstates is
easy to describe:

$$\begin{aligned}
|000, aaa\rangle &\longrightarrow |000, baa\rangle \\
|100, aaa\rangle &\longrightarrow |100, bba\rangle \\
|110, aaa\rangle &\longrightarrow |110, bbb\rangle \\
|111, aaa\rangle &\longrightarrow |111, bbb\rangle .
\end{aligned}$$

But imagine that Alice is sending a superposition of codewords of different
lengths. If the above process is unitary, then at the end the data qubits will
be entangled with their location degrees of freedom. The coherence of the
superposition would no longer be maintained within the data qubits. In order 
to restore the coherence, Bob would have to interact with the location degrees 
of freedom of the qubits with which he has indeterminate access. Except for 
a trivial case — in which Bob simply returns the qubits from location b back 
to a — he will not be able to do this. 
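The loss of coherence can be checked directly in a small sketch. It assumes the example code with length eigenstates $|000\rangle$, $|100\rangle$, $|110\rangle$, $|111\rangle$ of lengths 1, 2, 3, 3, and models each location register as a string of orthogonal labels `a` and `b`; the helper names `transmit` and `reduced_data_state` are hypothetical. Tracing out the location register leaves no off-diagonal term between codewords of different lengths, while codewords of equal length stay coherent.

```python
import math

# Lengths of the length-eigenstate codewords in the example code (assumed).
LENGTHS = {"000": 1, "100": 2, "110": 3, "111": 3}

def transmit(state, lengths=LENGTHS):
    """Move the first l location labels from 'a' to 'b' for each length-l codeword."""
    joint = {}
    for word, amp in state.items():
        l = lengths[word]
        location = "b" * l + "a" * (len(word) - l)
        joint[(word, location)] = complex(amp)
    return joint

def reduced_data_state(joint):
    """Trace out the location register; returns {(word1, word2): matrix element}."""
    rho = {}
    for (w1, loc1), a1 in joint.items():
        for (w2, loc2), a2 in joint.items():
            if loc1 == loc2:       # orthogonal location states contribute nothing
                rho[(w1, w2)] = rho.get((w1, w2), 0j) + a1 * a2.conjugate()
    return rho

s = 1 / math.sqrt(2)
mixed_lengths = reduced_data_state(transmit({"000": s, "100": s}))
same_length = reduced_data_state(transmit({"110": s, "111": s}))
print(abs(mixed_lengths.get(("000", "100"), 0j)))   # coherence between lengths 1 and 2
print(abs(same_length.get(("110", "111"), 0j)))     # coherence within length 3
```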

If the transmission process is not unitary, things are even worse. Our 
conclusion is that it is not possible to send quantum information coherently 
using an indeterminate number of qubits. If we are to use indeterminate- 
length quantum codes for quantum data compression, we will have to do so 
in such a way that a fixed number of qubits changes hands from Alice to Bob. 

Perfect fidelity would demand that Alice send all of the qubits to Bob — 
enough qubits so that even the longest component of each codeword is trans- 
mitted in its entirety. But this scheme would allow for no data compression 
at all. 

Our previous discussion of condensability offers some hope. The
condensation process took the "information-bearing" parts of $N$ zef codewords
(in registers of length $l_{\max}$) and unitarily shifted them as far as possible
toward the beginning of a tape of $N l_{\max}$ qubits. Although some branches of
the overall superposition may extend to the end of the tape, the "typical"
branch may be much shorter (followed by $|0\rangle$'s). We therefore might be able
to truncate the condensed string of codewords after some number $L$ of qubits,
where $L \ll N l_{\max}$, and still maintain an average fidelity approaching unity.

Let us consider a quantum information source that produces an ensemble 
of signal states of some quantum system. These signal states are unitarily en- 
coded as zef codewords of some condensable quantum code. For our purposes, 
therefore, we can simply consider the ensemble of zef codewords produced by 
the quantum information source and the unitary encoding. In this ensemble, 
the codeword $|a_{\mathrm{zef}}\rangle$ occurs with probability $p(a)$, and the average encoded
signal state is described by the density operator

$$\rho = \sum_a p(a)\, |a_{\mathrm{zef}}\rangle\langle a_{\mathrm{zef}}| . \tag{18}$$

Our source produces a sequence of independent, identically distributed
signals, which are encoded as zef codewords in separate registers. The average
state of $N$ of these registers is $\rho^{\otimes N}$.

The average length $\langle l \rangle$ of the codeword ensemble is

$$\langle l \rangle = \mathrm{Tr}\, \rho \Lambda = \sum_a p(a)\, \langle a_{\mathrm{zef}}| \Lambda |a_{\mathrm{zef}}\rangle . \tag{19}$$

The average length $\langle l \rangle$ is an ensemble average of quantum expectation values
for $\Lambda$, but no codeword $|a_{\mathrm{zef}}\rangle$ need be a length eigenstate.

A condensed string of $N$ codewords is a zef string of $N l_{\max}$ qubits, with
length observable $\Lambda$. If $U$ is the unitary operator that maps the $N$ separate
zef codewords to the condensed string, then we can define the overall length
observable for the condensed string to be

$$\Lambda = U (\Lambda_1 + \Lambda_2 + \cdots + \Lambda_N) U^{-1} .$$

The condensed length $\Lambda$ is just the sum of the individual length observables of
the separate, pre-condensed codewords. This observable will have eigenvalues
$L = l_1 + \cdots + l_N$ and an average value $\langle L \rangle$. The codewords are independent,
and so

$$\langle L \rangle = N \langle l \rangle . \tag{20}$$

Since the overall length of the condensed string is defined to be additive,
we can apply the "law of large numbers" to a measurement of $\Lambda$: for
any $\epsilon, \delta > 0$, for large enough $N$ it is true that

$$\Pr\left( |\Lambda - N \langle l \rangle| > N\delta \right) < \epsilon . \tag{21}$$

This means that, for large $N$, the probability is very small that $\Lambda$ will be
found to be much less than (or much greater than) $\langle L \rangle$. Of course, we will
not in general make such a measurement, but Equation 21 is still useful in
restricting the typical amplitude of codeword string components.
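The concentration of the total length can be illustrated numerically. The sketch below treats the outcomes of hypothetical length measurements on $N$ independent codewords as classical random variables, and assumes an illustrative distribution in which lengths 1, 2, 3 occur with probabilities 1/2, 1/4, 1/4 (these numbers are not taken from the text). It computes the exact probability that the total length deviates from $N\langle l\rangle$ by more than $N\delta$.

```python
def total_length_distribution(length_probs, n):
    """Exact distribution of the summed length of n independent codewords."""
    dist = {0: 1.0}
    for _ in range(n):
        nxt = {}
        for total, p in dist.items():
            for length, q in length_probs.items():
                nxt[total + length] = nxt.get(total + length, 0.0) + p * q
        dist = nxt
    return dist

def tail_probability(length_probs, n, delta):
    """Pr(|L - n<l>| > n*delta) for the summed length L."""
    mean = sum(length * q for length, q in length_probs.items())
    dist = total_length_distribution(length_probs, n)
    return sum(p for total, p in dist.items() if abs(total - n * mean) > n * delta)

# Assumed single-codeword length statistics (illustrative only).
LENGTH_PROBS = {1: 0.5, 2: 0.25, 3: 0.25}
for n in (20, 100, 400):
    print(n, tail_probability(LENGTH_PROBS, n, 0.1))
```

For fixed $\delta$ the printed tail probability shrinks as $N$ grows, which is the content of the law of large numbers used here.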

As we shall see, if the ensemble average length of the zef codewords is $\langle l \rangle$,
then we can in the long run maintain fidelity near to 1 by keeping just $\langle l \rangle + \delta$
qubits per signal, where $\delta$ can be made as small as desired. Conversely, in a
simple condensation process, we must keep at least $\langle l \rangle$ qubits per signal to
maintain high fidelity: if we keep only $\langle l \rangle - \delta$ per signal, the average fidelity
tends toward zero. We will also find that the ensemble average length of the
zef codewords is related to the von Neumann entropy of the signal ensemble,
making this approach an alternate route to the noiseless quantum coding
theorem. Finally, we will show that the relative entropy is a measure of
the additional resources (qubits) required to represent quantum information
using a code that is not optimal.

2.2 Enough qubits 

In this section we will make use of the fact that a condensed string of $N$ zef
codewords is itself in zef form. In other words, we can view the condensed
string as a zef codeword in a much longer code. The length observable for this
super-codeword will be the sum of the length observables for the $N$ original
codewords.

Suppose we have a zef codeword $|\phi\rangle$ in a register of $n$ qubits, and suppose
$\ell < n$. Define $\eta$ such that a measurement of the length observable $\Lambda$ on the
codeword yields a result larger than $\ell$ with probability

$$\Pr(\Lambda > \ell) = \eta . \tag{22}$$

In general $|\phi\rangle$ will include components of various lengths. Let $\Pi_\ell$ be the
projection

$$\Pi_\ell = 1^{1 \cdots \ell} \otimes |0^{\ell+1 \cdots n}\rangle\langle 0^{\ell+1 \cdots n}| . \tag{23}$$

That is, $\Pi_\ell$ projects onto the subspace of register states that are $|0\rangle$ in the
last $n - \ell$ qubits. We can write our zef codeword $|\phi\rangle$ as

$$|\phi\rangle = \alpha\, |\phi_{(\le\ell)}\rangle + \beta\, |\phi_{(>\ell)}\rangle \tag{24}$$

where $\alpha, \beta \ge 0$, and $|\phi_{(\le\ell)}\rangle$ and $|\phi_{(>\ell)}\rangle$ are normalized states such that

$$\Pi_\ell\, |\phi_{(\le\ell)}\rangle = |\phi_{(\le\ell)}\rangle , \qquad \Pi_\ell\, |\phi_{(>\ell)}\rangle = 0 .$$

Since all $\Lambda$-eigenstate zef codewords with length no larger than $\ell$ have $|0\rangle$
in the last $n - \ell$ qubits,

$$1 - \eta = \Pr(\Lambda \le \ell) \le \alpha^2 . \tag{25}$$

Equality need not hold, however, since some length eigenstate codewords
with $\Lambda > \ell$ may nevertheless have $|0\rangle$ in the last $n - \ell$ qubits. (This is
analogous to the classical situation, in which it is perfectly possible to have
one or more 0's at the end of a codeword in a variable-length code.)

We now imagine that we truncate the register by discarding the last $n - \ell$
qubits. Only $\ell$ qubits are stored or transmitted. At the receiver's end of
the process, $n - \ell$ qubits in the standard state $|0\rangle$ are appended, yielding a
mixed final state $\sigma$ for the register. With what fidelity $F = \langle\phi| \sigma |\phi\rangle$ has the
original codeword state been maintained by this process?

Direct calculation shows that the mixed state $\sigma$ is

$$\sigma = \alpha^2\, |\phi_{(\le\ell)}\rangle\langle\phi_{(\le\ell)}| + \beta^2\, w_{(>\ell)} \tag{26}$$

where $w_{(>\ell)}$ is the state obtained by truncating $|\phi_{(>\ell)}\rangle$ and appending $n - \ell$
qubits in the state $|0\rangle$. Thus

$$F = \langle\phi| \sigma |\phi\rangle \ge \alpha^4 . \tag{27}$$

Therefore,

$$F \ge \alpha^4 \ge (1 - \eta)^2 \ge 1 - 2\eta . \tag{28}$$

If the codeword length $\Lambda$ would be found to be no more than $\ell$ with
probability $1 - \eta$, then we can keep only $\ell$ qubits and recover the original state
with fidelity $F \ge 1 - 2\eta$.
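A small numerical check of the chain of bounds is sketched below. It assumes a three-qubit register holding a superposition of the example codewords $|000\rangle$, $|100\rangle$, $|110\rangle$, $|111\rangle$ (lengths 1, 2, 3, 3) with illustrative probabilities 0.5, 0.3, 0.15, 0.05, and truncates after $\ell = 2$ qubits. The function `truncation_fidelity` is a hypothetical helper that evaluates $F = \langle\phi|\sigma|\phi\rangle$ by summing, over each discarded tail, the squared overlap between the padded branch and the original state.

```python
import math

LENGTHS = {"000": 1, "100": 2, "110": 3, "111": 3}   # assumed example code

def truncation_fidelity(state, keep):
    """F = <phi|sigma|phi>: discard qubits after `keep`, then pad with |0...0>."""
    n = len(next(iter(state)))
    pad = "0" * (n - keep)
    amp = {w: complex(a) for w, a in state.items()}
    fidelity = 0.0
    for tail in {w[keep:] for w in amp}:
        # overlap of |phi> with the branch that had this tail, after padding
        overlap = sum(amp.get(w[:keep] + pad, 0j).conjugate() * a
                      for w, a in amp.items() if w[keep:] == tail)
        fidelity += abs(overlap) ** 2
    return fidelity

probs = {"000": 0.5, "100": 0.3, "110": 0.15, "111": 0.05}   # illustrative values
state = {w: math.sqrt(p) for w, p in probs.items()}
keep = 2
alpha_sq = sum(p for w, p in probs.items() if w[keep:] == "0" * (3 - keep))
eta = sum(p for w, p in probs.items() if LENGTHS[w] > keep)
fidelity = truncation_fidelity(state, keep)
print(fidelity, alpha_sq ** 2, (1 - eta) ** 2, 1 - 2 * eta)
```

In this example $1 - \eta$ is strictly smaller than $\alpha^2$, because the length-3 codeword $|110\rangle$ happens to end in $|0\rangle$.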

We can now apply this result and the law of large numbers (Equation 21)
to a condensed string of codewords. If $\epsilon, \delta > 0$ and $N$ is sufficiently large,
and if we take $\ell = N(\langle l \rangle + \delta)$, then the ensemble average probability that
the codeword string is longer than $\ell$ can be made smaller than $\epsilon/2$. We can
therefore truncate the string after only $N(\langle l \rangle + \delta)$ qubits and later recover
the original string with an average fidelity

$$\langle F \rangle > 1 - \epsilon . \tag{29}$$

Therefore, if we keep more than $\langle l \rangle$ qubits per input message, in the long
run we will be able to retrieve the quantum information with average fidelity
approaching unity. The average length $\langle l \rangle$ tells us how many qubits are
sufficient for high fidelity.



2.3 Too few qubits 

We now turn to the question of how many qubits are necessary to achieve 
high fidelity after the condensed string is truncated. For this discussion 
we will restrict our attention to simple condensation, rather than a general 
condensation process. Since any condensable code can be replaced by a 
simply condensable code with the same length characteristics, this restriction 
is not too severe. 

The reason for making this restriction is pragmatic. Suppose we have
$N$ registers containing codewords from a condensable code, with an average
length of $\langle l \rangle$. A general condensation procedure might consist of two stages.
In the first, the codewords in the $N$ separate registers are unitarily remapped
to codewords from a more efficient code, that is, one with shorter average
length $\langle l' \rangle < \langle l \rangle$. In the second stage, this more efficient code is condensed.
We have established that only about $N \langle l' \rangle$ qubits will be sufficient to
maintain high fidelity. In other words, the original average length $\langle l \rangle$ may tell us
nothing about the number of qubits necessary for high fidelity.

Of course, we might not choose to condense the codewords in this way, 
or a more efficient code might not exist. Our strategy will be to separate the 
question of the efficiency of a code from the question of how many qubits are 
necessary. First we will consider the simple condensation of codes that may 
be inefficient, and then (in the next section) we will discuss limits on the 
efficiency of codes. In this section, therefore, we describe limits imposed by 
the structure of our particular (possibly sub-optimal) code, and in the next 
we will indicate how optimal or near-optimal codes may be chosen. 

Begin with $N$ zef codewords of a simply condensable code. The simply
condensed string formed from the N codewords can be built out of two pieces: 

1. the simply condensed qubit string obtained from the first N — k code- 
words, and 

2. the simply condensed qubit string obtained from the last k codewords. 

These two pieces are both zef and are simply condensed together to form the 
complete string. Thus, we will base our discussion on the simple condensation 
of just two zef codewords. 

The first zef codeword $|\psi\rangle$ lies in a register of $m$ qubits, and the second
codeword $|\chi\rangle$ lies in a register of $n$ qubits. The simply condensed pair
(denoted rather symbolically by $|\psi\chi\rangle$) is a state of a string of $m + n$ qubits. We
also consider a state called $|\psi 0\rangle$, which is the first zef codeword followed by
$n$ additional qubits in the state $|0\rangle$.

Let $\ell < m + n$. The first zef codeword can be written

$$|\psi\rangle = \alpha\, |\psi_{(<\ell)}\rangle + \beta\, |\psi_{(\ge\ell)}\rangle \tag{30}$$

where $\alpha, \beta \ge 0$ and $|\psi_{(<\ell)}\rangle$ (or $|\psi_{(\ge\ell)}\rangle$) is a normalized superposition of length
eigenstates that are shorter than (or at least as long as) $\ell$. If we now simply
condense this codeword with the codeword $|\chi\rangle$, we obtain

$$|\psi\chi\rangle = \alpha\, |\psi_{(<\ell)}\chi\rangle + \beta\, |\psi_{(\ge\ell)}\chi\rangle , \tag{31}$$

with $|\psi_{(<\ell)}\chi\rangle$ and $|\psi_{(\ge\ell)}\chi\rangle$ being the simply condensed strings obtained from
$|\chi\rangle$ and the two components of $|\psi\rangle$. In a similar way,

$$|\psi 0\rangle = \alpha\, |\psi_{(<\ell)} 0\rangle + \beta\, |\psi_{(\ge\ell)} 0\rangle . \tag{32}$$

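Now we imagine truncating the string of $m + n$ qubits, keeping only the
first $\ell$ of them to be stored or transmitted. (We can denote this process
by $\mathcal{T}_\ell$.) At the receiver's end, we do some unspecified quantum operation $\mathcal{E}$
that results in a final state of $m + n$ qubits. We know nothing about $\mathcal{E}$ in
general except that it is a trace-preserving, completely positive linear map
on density operators. The overall process, applied to the two initial states
$|\psi\chi\rangle$ and $|\psi 0\rangle$, yields

$$\begin{aligned}
|\psi\chi\rangle &\xrightarrow{\ \mathcal{T}_\ell\ } \omega \xrightarrow{\ \mathcal{E}\ } \mathcal{E}(\omega) \\
|\psi 0\rangle &\xrightarrow{\ \mathcal{T}_\ell\ } \sigma \xrightarrow{\ \mathcal{E}\ } \mathcal{E}(\sigma) ,
\end{aligned}$$

where $\omega$ and $\sigma$ are the truncated states of the first $\ell$ qubits.
At the end of this process, we are interested in the overall fidelity of the
truncation-cum-recovery process:

$$F = \langle\psi\chi|\, \mathcal{E}(\omega)\, |\psi\chi\rangle . \tag{33}$$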

We will show that, under suitable conditions, this fidelity must be small. 
For general density operators, the fidelity is defined to be 

$$F(\rho_1, \rho_2) = \max |\langle 1|2 \rangle|^2 , \tag{34}$$

where the maximum is taken over all purifications $|1\rangle$ of $\rho_1$ and $|2\rangle$ of $\rho_2$.
(Equivalently, we can fix one of the purifications $|1\rangle$ and maximize over the
other purification $|2\rangle$.) The fidelity has the property that it is never decreased
by any quantum operation, so that

$$F(\mathcal{E}(\rho_1), \mathcal{E}(\rho_2)) \ge F(\rho_1, \rho_2) \tag{35}$$

for any trace-preserving, completely positive linear map $\mathcal{E}$.

A useful result (shown in [13]) relates the fidelities among three states $\rho_1$,
$\rho_2$ and $\rho_3$. Let $F_{12} = F(\rho_1, \rho_2)$, etc. Then

$$F_{13} \le F_{23} + 2\left(1 - \sqrt{F_{12}}\right) + 2\sqrt{2 F_{23} \left(1 - \sqrt{F_{12}}\right)} . \tag{36}$$

This implies that, if $F_{12}$ is nearly equal to one and $F_{23}$ is close to zero, $F_{13}$
is also close to zero. Recalling that $0 \le F \le 1$ for all fidelities, we note that
$1 - \sqrt{F} \le 1 - F$, and thus

$$\begin{aligned}
F_{13} &\le F_{23} + 2(1 - F_{12}) + 2\sqrt{2 F_{23} (1 - F_{12})} \\
&\le F_{23} + 2(1 - F_{12}) + 2\sqrt{2 (1 - F_{12})} \\
&\le F_{23} + 2\sqrt{1 - F_{12}} + 2\sqrt{2}\sqrt{1 - F_{12}} \\
F_{13} &\le F_{23} + 5\sqrt{1 - F_{12}} . \tag{37}
\end{aligned}$$

Since this inequality is linear in both $F_{13}$ and $F_{23}$, it will be convenient for
situations in which we wish to average over an ensemble of $\rho_3$ states.
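The three-state bound can be spot-checked numerically for pure states, where $F_{ij} = |\langle i|j\rangle|^2$. The sketch below draws random pure states and tests both Equation 37 and the sharper intermediate form $F_{13} \le F_{23} + 2(1-\sqrt{F_{12}}) + 2\sqrt{2F_{23}(1-\sqrt{F_{12}})}$, which is assumed here as the starting point of the chain. The dimension, trial count and seed are arbitrary choices, and a passing check is an illustration rather than a proof.

```python
import math
import random

def random_state(dim, rng):
    """Random normalized complex vector (a pure state)."""
    vec = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]
    norm = math.sqrt(sum(abs(z) ** 2 for z in vec))
    return [z / norm for z in vec]

def fidelity(u, v):
    """Pure-state fidelity |<u|v>|^2."""
    return abs(sum(a.conjugate() * b for a, b in zip(u, v))) ** 2

def check_three_state_bounds(trials=2000, dim=4, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        s1, s2, s3 = (random_state(dim, rng) for _ in range(3))
        f12, f13, f23 = fidelity(s1, s2), fidelity(s1, s3), fidelity(s2, s3)
        gap = max(0.0, 1 - math.sqrt(f12))
        sharper = f23 + 2 * gap + 2 * math.sqrt(2 * f23 * gap)
        linear = f23 + 5 * math.sqrt(max(0.0, 1 - f12))   # Equation 37
        if not (f13 <= sharper + 1e-12 and sharper <= linear + 1e-12):
            return False
    return True

print(check_three_state_bounds())
```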

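We apply Equation 37 to our situation as follows. The state $\rho_1 = |\psi\chi\rangle\langle\psi\chi|$ is
the original simply condensed string, and the state $\rho_3 = \mathcal{E}(\omega)$ is
the final state of the simply condensed string after the truncation $\mathcal{T}_\ell$ and
the recovery operation $\mathcal{E}$. Playing the role of $\rho_2$ is the state $\mathcal{E}(\sigma)$, the final
state obtained by using $|\psi 0\rangle$ as our input. Since the quantum operation
$\mathcal{E}$ can never decrease the fidelity between states, $F(\mathcal{E}(\omega), \mathcal{E}(\sigma)) \ge F(\omega, \sigma)$.
Fidelity is symmetric in its arguments, so Equation 37 also holds with the
labels 1 and 3 interchanged. Therefore,

$$F = \langle\psi\chi|\, \mathcal{E}(\omega)\, |\psi\chi\rangle \le \langle\psi\chi|\, \mathcal{E}(\sigma)\, |\psi\chi\rangle + 5\sqrt{1 - F(\omega, \sigma)} . \tag{38}$$

The initial states $|\psi\chi\rangle$ and $|\psi 0\rangle$ are purifications of $\omega$ and $\sigma$, respectively.
The fidelity $F(\omega, \sigma)$ is thus

$$F(\omega, \sigma) = \max_{|\phi_\sigma\rangle} |\langle\psi\chi|\phi_\sigma\rangle|^2 \tag{39}$$

where the maximum is taken over all purifications $|\phi_\sigma\rangle$ of $\sigma$. Now, all of the
purifications of $\sigma$ are related to one another by unitary operators that act
only on the adjoined system, so that

$$F(\omega, \sigma) = \max_U \left| \langle\psi\chi| \left( 1^{1 \cdots \ell} \otimes U^{\ell+1 \cdots m+n} \right) |\psi 0\rangle \right|^2 \tag{40}$$

with the maximum taken over all unitary operators acting on the last $m + n - \ell$
qubits.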
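We write $|\psi\chi\rangle = \alpha\, |\psi_{(<\ell)}\chi\rangle + \beta\, |\psi_{(\ge\ell)}\chi\rangle$ and
$|\psi 0\rangle = \alpha\, |\psi_{(<\ell)} 0\rangle + \beta\, |\psi_{(\ge\ell)} 0\rangle$,
as before, and note that, since $|\psi_{(\ge\ell)}\rangle$ only contains components of $|\psi\rangle$ that
are at least as long as $\ell$,

$$\mathrm{Tr}_{\ell+1 \cdots m+n}\, |\psi_{(\ge\ell)}\chi\rangle\langle\psi_{(\ge\ell)}\chi| = \mathrm{Tr}_{\ell+1 \cdots m+n}\, |\psi_{(\ge\ell)} 0\rangle\langle\psi_{(\ge\ell)} 0| . \tag{41}$$

In this component, the second codeword, whose "starting address" in the
simply condensed string is entangled with the length of the first codeword,
lies entirely in the discarded tail of the qubit string. Therefore, there exists
a unitary $V^{\ell+1 \cdots m+n}$ such that

$$|\psi_{(\ge\ell)}\chi\rangle = \left( 1^{1 \cdots \ell} \otimes V^{\ell+1 \cdots m+n} \right) |\psi_{(\ge\ell)} 0\rangle . \tag{42}$$

Clearly,

$$F(\omega, \sigma) \ge \left| \langle\psi\chi| \left( 1^{1 \cdots \ell} \otimes V^{\ell+1 \cdots m+n} \right) |\psi 0\rangle \right|^2 . \tag{43}$$

Writing $1 \otimes V$ for $1^{1 \cdots \ell} \otimes V^{\ell+1 \cdots m+n}$, we expand

$$\begin{aligned}
\langle\psi\chi| (1 \otimes V) |\psi 0\rangle = {} & \alpha^2\, \langle\psi_{(<\ell)}\chi| (1 \otimes V) |\psi_{(<\ell)} 0\rangle
 + \alpha\beta\, \langle\psi_{(<\ell)}\chi| (1 \otimes V) |\psi_{(\ge\ell)} 0\rangle \\
& + \beta\alpha\, \langle\psi_{(\ge\ell)}\chi| (1 \otimes V) |\psi_{(<\ell)} 0\rangle
 + \beta^2\, \langle\psi_{(\ge\ell)}\chi| (1 \otimes V) |\psi_{(\ge\ell)} 0\rangle .
\end{aligned}$$

By Equation 42 the last term equals $\beta^2 = 1 - \alpha^2$, while the other three
terms have magnitudes no greater than $\alpha^2$, $\alpha\beta$ and $\beta\alpha$. Hence

$$\left| \langle\psi\chi| (1 \otimes V) |\psi 0\rangle \right| \ge 1 - 2\alpha - 2\alpha^2 \ge 1 - 4\alpha .$$

Therefore

$$F(\omega, \sigma) \ge (1 - 4\alpha)^2 \ge 1 - 8\alpha . \tag{44}$$

Our overall fidelity must satisfy

$$F \le \langle\psi\chi|\, \mathcal{E}(\sigma)\, |\psi\chi\rangle + 5\sqrt{8\alpha} \le \langle\psi\chi|\, \mathcal{E}(\sigma)\, |\psi\chi\rangle + 15\sqrt{\alpha} . \tag{45}$$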

Neither the operator $\mathcal{E}(\sigma)$ nor the parameter $\alpha$ depends on the second
codeword $|\chi\rangle$. We now imagine that the second codeword is drawn from an
ensemble: the codeword $|\chi\rangle$ occurs with probability $P(\chi)$, so
that the ensemble has an average density operator

$$W = \sum_\chi P(\chi)\, |\chi\rangle\langle\chi| . \tag{46}$$

The average fidelity after truncation $\mathcal{T}_\ell$ and recovery $\mathcal{E}$ will therefore be

$$\bar{F} \le \mathrm{Tr}\, W \mathcal{E}(\sigma) + 15\sqrt{\alpha} . \tag{47}$$

Since $\mathcal{E}(\sigma)$ is a positive operator of unit trace, we obtain

$$\bar{F} \le \|W\| + 15\sqrt{\alpha} , \tag{48}$$

where $\|W\|$ is the operator norm of $W$, which (since $W$ is positive) is just
the largest eigenvalue of $W$.

After all of this, we are in a position to apply the law of large numbers
(Equation 21) again. We will be choosing two large integers, $N$ and $k$.
Our first codeword $|\psi\rangle$ in the preceding analysis will be a simply condensed
string of $N - k$ codewords, and the second codeword $|\chi\rangle$ will be a simply
condensed string of the remaining $k$ codewords. We assume that the
ensemble of single-register codewords has an average state $\rho$ with more than one
non-zero eigenvalue; in other words, the ensemble involves more than one
codeword state.

Let $\epsilon, \delta > 0$. If $\lambda < 1$ is the largest eigenvalue of $\rho$, then the largest
eigenvalue of $\rho^{\otimes k}$ is $\lambda^k$. Choose $k$ so that $\lambda^k < \epsilon/2$. Since the last $k$ codewords
are unitarily condensed into a string with average state $W$, $\|W\| < \epsilon/2$.

Now we consider the simply condensed string of the first $N - k$ codewords,
which we have denoted by $|\psi\rangle$. The length observable for this string is $\Lambda_{N-k}$.
Given a value of $N$, we define $\ell = N(\langle l \rangle - \delta)$. We will restrict our attention
to values of $N$ large enough so that

$$\ell < (N - k) \left( \langle l \rangle - \frac{\delta}{2} \right) . \tag{49}$$

Applying the law of large numbers, we can now specify $N$ large enough so that
$\Pr(\Lambda_{N-k} < \ell) = \alpha^2$ is as small as we like. In particular, we can guarantee
that $15\sqrt{\alpha} < \epsilon/2$. Thus,

$$\bar{F} \le \|W\| + 15\sqrt{\alpha} < \epsilon . \tag{50}$$

Therefore, if we keep fewer than $\langle l \rangle$ qubits per input message and use
simple condensation, in the long run the fidelity of the retrieved quantum
information must approach zero. The average length $\langle l \rangle$ tells us how many
qubits are necessary for high fidelity using simple condensation.
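The parameter choices in this argument can be made concrete with a little arithmetic. The sketch below assumes the margin $\delta/2$ in Equation 49 and uses illustrative inputs (largest eigenvalue 1/2, $\epsilon = 1/10$, $\langle l \rangle = 7/4$, $\delta = 1/10$) that are not taken from the text; `choose_parameters` is a hypothetical helper. It returns the smallest $k$ with $\lambda^k < \epsilon/2$, the value that $\alpha$ must stay below so that $15\sqrt{\alpha} < \epsilon/2$, and the smallest $N$ satisfying Equation 49. The law of large numbers may require a larger $N$ still; that step is not computed here.

```python
from fractions import Fraction

def choose_parameters(lam, eps, mean_len, delta):
    """Smallest k, the alpha threshold, and the smallest N allowed by Equation 49."""
    k = 1
    while lam ** k >= eps / 2:          # requires lam < 1
        k += 1
    alpha_max = (eps / 30) ** 2         # 15*sqrt(alpha) < eps/2  <=>  alpha < (eps/30)^2
    n = k + 1
    while not n * (mean_len - delta) < (n - k) * (mean_len - delta / 2):
        n += 1
    return k, alpha_max, n

k, alpha_max, n_min = choose_parameters(
    Fraction(1, 2), Fraction(1, 10), Fraction(7, 4), Fraction(1, 10))
print(k, alpha_max, n_min)
```

Exact rational arithmetic is used so that the strict inequalities are not affected by rounding.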



2.4 Entropy and average length



The preceding results provide an interpretation for the average length $\langle l \rangle$ of
an indeterminate-length quantum code: $\langle l \rangle$ is just a measure of the resources
(qubits) that are both necessary and sufficient to maintain high fidelity of
the quantum information, in the situations described above. We now inquire
how short $\langle l \rangle$ can be for a given quantum information source. In other words,
we will now explore how efficient an indeterminate-length quantum code may
be.

Recall the quantum Kraft-McMillan inequality (Equation 5). Any
condensable quantum code must have a length observable $\Lambda$ on zef codewords
that satisfies

$$\mathrm{Tr}\, 2^{-\Lambda} = K \le 1 ,$$

where the trace is restricted to the zef subspace. We can construct a density
operator $\omega$ on the zef subspace by letting

$$\omega = \frac{1}{K}\, 2^{-\Lambda} . \tag{51}$$

The operator $\omega$, although a positive operator of unit trace, is generally not the
same as the ensemble average density operator $\rho$ of the codewords produced
by the information source.

The average codeword length $\langle l \rangle$ is

$$\begin{aligned}
\langle l \rangle &= \mathrm{Tr}\, \rho \Lambda \\
&= -\mathrm{Tr}\, \rho \log\left( 2^{-\Lambda} \right) \\
&= -\mathrm{Tr}\, \rho \log \omega - \log K .
\end{aligned}$$

Therefore

$$\langle l \rangle = S(\rho) + \mathcal{D}(\rho \| \omega) - \log K , \tag{52}$$

where $S(\rho)$ is the von Neumann entropy of the density operator $\rho$

$$S(\rho) = -\mathrm{Tr}\, \rho \log \rho \tag{53}$$

and $\mathcal{D}(\rho \| \omega)$ is the quantum relative entropy

$$\mathcal{D}(\rho \| \omega) = \mathrm{Tr}\, \rho \log \rho - \mathrm{Tr}\, \rho \log \omega . \tag{54}$$

(We use base-2 logarithms.) The relative entropy has a number of useful
properties. For example, it is positive-definite, so that $\mathcal{D}(\rho \| \omega) > 0$ if and
only if $\rho \ne \omega$.

Since $\log K \le 0$,

$$\langle l \rangle \ge S(\rho) . \tag{55}$$

The average codeword length must always be at least as great as the von
Neumann entropy of the signal ensemble from the information source.
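The decomposition of the average length can be checked numerically in the simplest setting. The sketch below assumes that $\rho$ is diagonal in the length eigenbasis, so that every operator reduces to a list of eigenvalues; the probabilities and lengths are illustrative, and the lengths 1, 2, 3, 4 are chosen so that $K = 15/16 < 1$. The helper name `average_length_terms` is hypothetical.

```python
import math

def average_length_terms(p, lengths):
    """Return (<l>, S, D, -log2 K) for a source diagonal in the length eigenbasis."""
    K = sum(2.0 ** -l for l in lengths)
    omega = [2.0 ** -l / K for l in lengths]
    avg_len = sum(pi * l for pi, l in zip(p, lengths))
    S = -sum(pi * math.log2(pi) for pi in p if pi > 0)
    D = sum(pi * math.log2(pi / wi) for pi, wi in zip(p, omega) if pi > 0)
    return avg_len, S, D, -math.log2(K)

avg_len, S, D, minus_log_K = average_length_terms([0.4, 0.3, 0.2, 0.1], [1, 2, 3, 4])
print(avg_len, S + D + minus_log_K)   # the two numbers agree up to rounding
```

Both correction terms are non-negative in this example, so the average length exceeds the entropy.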

We can approach this bound by a suitable code. The eigenvalues $\lambda_k$ of
$\rho$ form a probability distribution $\lambda$, and the von Neumann entropy is simply
the Shannon entropy of the eigenvalues:

$$S(\rho) = H(\lambda) = -\sum_k \lambda_k \log \lambda_k . \tag{56}$$

The probability distribution $\lambda$ can be used to define a Shannon-Fano code,
which is a classical prefix-free binary code whose codewords have integer
lengths $l_k = \lceil -\log \lambda_k \rceil$, so that

$$l_k < -\log \lambda_k + 1 . \tag{57}$$

This means that the average length of the Shannon-Fano codewords satisfies

$$\langle l \rangle = \sum_k \lambda_k l_k < H(\lambda) + 1 . \tag{58}$$
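A classical sketch of this construction follows. It assumes the lengths $l_k = \lceil -\log_2 \lambda_k \rceil$ and builds one possible prefix-free codeword assignment by a canonical ordering, which is only one of several ways to realize those lengths. The eigenvalue list is illustrative and the function names are hypothetical.

```python
import math

def shannon_fano_lengths(probs):
    """Codeword lengths l_k = ceil(-log2 p_k)."""
    return [math.ceil(-math.log2(p)) for p in probs]

def canonical_prefix_code(lengths):
    """Assign binary codewords with the given lengths (needs Kraft sum <= 1)."""
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    code = [None] * len(lengths)
    value, prev = 0, 0
    for i in order:
        value <<= lengths[i] - prev            # extend to the new length
        code[i] = format(value, "0{}b".format(lengths[i]))
        value += 1
        prev = lengths[i]
    return code

def is_prefix_free(code):
    return not any(a != b and b.startswith(a) for a in code for b in code)

probs = [0.4, 0.3, 0.2, 0.1]                   # illustrative eigenvalues
lengths = shannon_fano_lengths(probs)
code = canonical_prefix_code(lengths)
entropy = -sum(p * math.log2(p) for p in probs)
average = sum(p * l for p, l in zip(probs, lengths))
print(lengths, code, entropy, average)
```

The Kraft sum of these lengths is strictly less than 1, so this code is not the shortest possible for the distribution.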

The classical Shannon-Fano code can be used to define a corresponding
prefix-free indeterminate-length quantum code, according to the procedure
in Equation 12. (Such a code was also described by Chuang and Modha.)
Eigenstates of $\rho$ are length eigenstate zef codewords, and the average
codeword length satisfies

$$\langle l \rangle < S(\rho) + 1 . \tag{59}$$

Asymptotically, this code will achieve high fidelity using about $S(\rho) + 1$ qubits
per signal.

An alternate scheme is based on Huffman codes, which are classical
prefix-free codes that actually minimize average codeword length $\langle l \rangle$. Equations 58
and 59 are also satisfied for Huffman codes and their quantum versions.
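For comparison, a minimal classical Huffman construction is sketched below using the standard-library `heapq` module. It returns only the codeword lengths, which is all the length observable needs; the probability list is illustrative and `huffman_lengths` is a hypothetical helper name.

```python
import heapq
import math

def huffman_lengths(probs):
    """Codeword lengths of a binary Huffman code for the given probabilities."""
    # Heap entries: (probability, tie-breaker, symbols in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:                      # merged symbols move one level deeper
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths

probs = [0.4, 0.3, 0.2, 0.1]
lengths = huffman_lengths(probs)
entropy = -sum(p * math.log2(p) for p in probs)
average = sum(p * l for p, l in zip(probs, lengths))
print(lengths, sum(2.0 ** -l for l in lengths), entropy, average)
```

For this distribution the Huffman lengths saturate the Kraft sum and give a shorter average length than the ceiling rule $\lceil -\log_2 \lambda_k \rceil$.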

We can do even better if we create our zef codewords from blocks of
outputs of the quantum information source. This amounts to considering a
new source that produces blocks of $n$ elementary signals, with an ensemble
average block state $\rho^{\otimes n}$ having an entropy of $n S(\rho)$. A quantum
Shannon-Fano or Huffman code designed for this block source would have an average
length of no more than $n S(\rho) + 1$, so that we will use only $S(\rho) + \frac{1}{n}$ qubits
per elementary signal. Thus, by coding long blocks of signals, we can achieve
$F \to 1$ with about $S(\rho)$ qubits per elementary signal.
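The effect of block coding can be seen in a small calculation. The sketch assumes a qubit source whose average state has eigenvalues 0.9 and 0.1 (illustrative values) and applies the Shannon-Fano length rule $\lceil -\log_2 p \rceil$ to the eigenvalues of the $n$-signal block state; `block_rate` is a hypothetical helper. The resulting average length per elementary signal stays between $S(\rho)$ and $S(\rho) + 1/n$.

```python
import math

def block_rate(lam, n):
    """Average Shannon-Fano length per signal for blocks of n signals."""
    total = 0.0
    for j in range(n + 1):
        p = lam ** (n - j) * (1 - lam) ** j     # block eigenvalue with j "rare" signals
        total += math.comb(n, j) * p * math.ceil(-math.log2(p))
    return total / n

lam = 0.9
entropy = -(lam * math.log2(lam) + (1 - lam) * math.log2(1 - lam))
for n in (1, 2, 4, 8, 12):
    print(n, block_rate(lam, n), entropy + 1 / n)
```

The per-signal cost need not decrease monotonically with $n$, but it is always squeezed below $S(\rho) + 1/n$.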

It can be seen that the theory of indeterminate-length quantum codes
provides an alternate route to the quantum noiseless coding theorem.
The von Neumann entropy S(p) measures the physical resources necessary 
to represent quantum information faithfully. 

We now ask: under what circumstances can we achieve the entropic bound
to the codeword length exactly, without resorting to block coding? In other
words, for what codes and codeword ensembles can we have

$$\langle l \rangle = S(\rho)\, ? \tag{60}$$

A code for which this equality holds may be called "length optimizing". The
answer can be seen from Equation 52:

$$\langle l \rangle = S(\rho) + \mathcal{D}(\rho \| \omega) - \log K .$$

Both $\mathcal{D}(\rho \| \omega)$ and $-\log K$ are non-negative, so they must both equal zero
for a length optimizing code. In other words,

$$K = \mathrm{Tr}\, 2^{-\Lambda} = 1 \tag{61}$$

and

$$\rho = \omega = 2^{-\Lambda} . \tag{62}$$

A length optimizing code must saturate the quantum Kraft inequality
(Equation 5), and the codeword ensemble must equal the density operator $\omega$
constructed from the length observable $\Lambda$. Two consequences follow:

• Whenever the signal ensemble $\rho$ has only eigenvalues of the form $2^{-m}$
for integer values of $m$, we can find a condensable quantum code (with
length eigenvalues $m$) that is length optimizing. If $\rho$ has eigenvalues
that are not of this form, then no length optimizing code exists.

• Some quantum codes saturate the quantum Kraft inequality, for
example those based on classical Huffman codes. These codes will be
length optimizing for a codeword ensemble with density operator

$$\rho = 2^{-\Lambda} . \tag{63}$$

That is, every quantum code that saturates the quantum Kraft inequality
is length optimizing for some codeword ensemble. If a quantum
code does not saturate the quantum Kraft inequality, it is not length
optimizing for any codeword ensemble.

Suppose we have a code that is length optimizing for some density
operator $\omega$; but instead, we use the code for an ensemble of codewords described
by the density operator $\rho$. Then the average codeword length will be

$$\langle l \rangle = S(\rho) + \mathcal{D}(\rho \| \omega) . \tag{64}$$

We know that, using block coding, we can asymptotically use as few as
$S(\rho)$ qubits to faithfully represent the quantum information produced by the
source of $\rho$. We also know that $\langle l \rangle$ is the minimum number of qubits we need
to retain per codeword to achieve high fidelity in a simply condensed string
of many codewords. Thus, the relative entropy $\mathcal{D}(\rho \| \omega)$ tells us what
additional resources (in qubits) are necessary to faithfully represent the quantum
information from the $\rho$-source, if we use a code that is length optimizing for
a different source (the "$\omega$-source").
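A numerical illustration of length optimization and of the mismatch penalty is sketched below. It assumes the commuting case, in which $\rho$ and $\omega$ are both diagonal in the length eigenbasis, and takes lengths 1, 2, 3, 3 so that $\omega = 2^{-\Lambda}$ has eigenvalues 1/2, 1/4, 1/8, 1/8. The mismatched source eigenvalues are illustrative.

```python
import math

def entropy(p):
    return -sum(x * math.log2(x) for x in p if x > 0)

def relative_entropy(p, q):
    return sum(x * math.log2(x / y) for x, y in zip(p, q) if x > 0)

lengths = [1, 2, 3, 3]
omega = [2.0 ** -l for l in lengths]      # Kraft sum is exactly 1
rho = [0.4, 0.3, 0.2, 0.1]                # a different, mismatched source

def average_length(p):
    return sum(x * l for x, l in zip(p, lengths))

# Matched source: the average length equals the entropy.
print(average_length(omega), entropy(omega))
# Mismatched source: the excess over the entropy is the relative entropy.
print(average_length(rho), entropy(rho) + relative_entropy(rho, omega))
```

For the matched source the code is length optimizing; for the mismatched source the extra cost per codeword is $\mathcal{D}(\rho \| \omega)$ qubits.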

2.5 Remarks 

In the quantum Huffman code of Braunstein et al., codeword length
information and the codewords themselves are stored separately, in entangled strings
of qubits. This means that the average number of qubits used to store the
quantum information from a given source is increased by an amount
logarithmic in the codeword length. However, as we have seen, this separate
accounting for codeword length information is unnecessary. The codewords 
of a quantum indeterminate-length code carry their own length information. 

This requirement is the basis for Equation 5, the quantum Kraft-McMillan
inequality. We have shown that Equation 5 is a necessary and sufficient
condition for condensability, and further, that any code satisfying Equation 5
can be unitarily mapped to a prefix-free quantum code with the same length 
characteristics. Prefix-free codes are themselves simply condensable, and 
obey the quantum Kraft-McMillan inequality. 

Classical prefix-free codes are also called "instantaneous codes", since the
receiver of a string of codewords can identify an individual codeword from
the string immediately, before the remainder of the string is received.
But this terminology is inapplicable to the quantum case. Suppose we have
a simply condensed string of codewords from a prefix-free quantum code.
The first codeword is generally not a length eigenstate, and the length of this 
codeword is entangled with the locations in the qubit string of all subsequent 
codewords. The phase relationship between the different-length components 
of the first codeword is a global property of the state of the entire string. 
Therefore, in order to coherently recover even the first codeword, we will 
need the entire string (or a sufficiently long initial segment to achieve high 
overall fidelity). Even prefix-free quantum codes are not "instantaneous"; the 
entire transmission must be completed before any part of it can be "read" . 

The classical Kraft-McMillan inequality (Equation 11) arises whenever
a set of binary strings satisfies the prefix-free condition. For example, it
governs the set of lengths of distinct programs for a classical Turing machine.
The Kraft-McMillan inequality therefore plays a central role in algorithmic
information theory, in which the information content of a binary string $s$ is
defined to be the length of the shortest halting program that produces $s$ as its
output. We may hope that the quantum version of the Kraft-McMillan
inequality will serve as a starting point for the development of a quantum 
algorithmic information theory. 

We are happy to acknowledge our indebtedness to many colleagues with 
whom we have discussed this work, including C. M. Caves, S. Braunstein, C. 
A. Fuchs, W. K. Wootters, T. M. Cover, and I. L. Chuang. One of us (BS) 
is grateful for the support of a Rosenbaum Fellowship at the Isaac Newton 
Institute for Mathematical Sciences in the summer of 1999. 